Exploration by Maximizing Renyi Entropy for Reward-Free RL Framework

Authors

Abstract

Exploration is essential for reinforcement learning (RL). To face the challenges of exploration, we consider a reward-free RL framework that completely separates exploration from exploitation and brings new challenges into algorithm design. In the exploration phase, the agent learns an exploratory policy by interacting with a reward-free environment and collects a dataset of transitions by executing the policy. In the planning phase, the agent computes a good policy for any given reward function based on the dataset, without further interacting with the environment. This framework is suitable for the meta RL setting where there are many reward functions of interest. In the exploration phase, we propose to maximize the Renyi entropy over the state-action space and justify this objective theoretically. The success of using Renyi entropy as the objective results from its encouragement to explore hard-to-reach state-actions. We further deduce a policy gradient formulation for this objective and design a practical exploration algorithm that can deal with complex environments. In the planning phase, we solve for good policies given arbitrary reward functions using a batch RL algorithm. Empirically, we show that our exploration algorithm is effective and sample efficient, and results in superior policies in the planning phase.
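
To make the exploration objective concrete, here is a minimal tabular sketch (not the authors' implementation) of maximizing the Renyi entropy H_alpha(d) = (1/(1-alpha)) log sum d(s,a)^alpha of an estimated state-action visitation distribution: the gradient of this objective yields an intrinsic reward that favors rarely visited state-action pairs. The environment, rollout collection, and policy update are assumed to come from the reader's own RL code.

```python
# Minimal sketch, assuming a tabular state-action space and an empirical
# visitation distribution estimated from rollouts; the actual paper handles
# complex environments with a policy gradient formulation.
import numpy as np

def renyi_entropy(d, alpha=0.5, eps=1e-12):
    """Renyi entropy of order alpha for a discrete distribution d."""
    d = np.asarray(d, dtype=float) + eps
    return np.log(np.sum(d ** alpha)) / (1.0 - alpha)

def intrinsic_reward(d, alpha=0.5, eps=1e-12):
    """Per state-action reward proportional to dH_alpha/dd(s,a).

    For alpha in (0, 1), the factor d(s,a)^(alpha-1) grows as the visitation
    probability shrinks, encouraging hard-to-reach state-actions.
    """
    d = np.asarray(d, dtype=float) + eps
    return (alpha / (1.0 - alpha)) * d ** (alpha - 1) / np.sum(d ** alpha)

# Toy usage: 6 state-action pairs, one of them rarely visited.
counts = np.array([40, 35, 30, 25, 20, 1], dtype=float)
d_hat = counts / counts.sum()
print("H_0.5(d) =", renyi_entropy(d_hat, alpha=0.5))
print("intrinsic rewards:", intrinsic_reward(d_hat, alpha=0.5))
# The rare pair (count 1) receives the largest intrinsic reward, so a
# policy-gradient learner trained on these rewards is pushed toward
# under-explored regions of the state-action space.
```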

Related Articles

Renyi Entropy Estimation Revisited

We revisit the problem of estimating the entropy of discrete distributions from independent samples, studied recently by Acharya, Orlitsky, Suresh and Tyagi (SODA 2015), improving their upper and lower bounds on the necessary sample size n. For estimating the Renyi entropy of order α, up to constant accuracy and error probability, we show the following upper bound: n = O(1) · 2^{(1−1/α)·H_α} for integer α...
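
As a hedged illustration of the quoted bound (the constants hidden in O(1) are not reproduced here), the snippet below computes H_α in bits for a toy distribution and the resulting 2^{(1−1/α)·H_α} scaling factor.

```python
# Illustration only: compute the Renyi entropy of order alpha (in bits) and
# the 2^{(1 - 1/alpha) * H_alpha} factor that governs the sample-size bound.
import numpy as np

def renyi_entropy_bits(p, alpha):
    p = np.asarray(p, dtype=float)
    return np.log2(np.sum(p ** alpha)) / (1.0 - alpha)

p = np.array([0.5, 0.25, 0.125, 0.125])   # example distribution
alpha = 2                                  # integer order, as in the quoted bound
H = renyi_entropy_bits(p, alpha)
sample_scale = 2 ** ((1 - 1 / alpha) * H)  # the 2^{(1-1/alpha) H_alpha} factor
print("H_2 =", H, "bits; sample-size factor =", sample_scale)
```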

Shannon Entropy, Renyi Entropy, and Information

This memo contains proofs that the Shannon entropy is the limiting case of both the Renyi entropy and the Tsallis entropy, or information. These results are also confirmed experimentally. We conclude with some general observations on the utility of entropy measures. A brief summary of the origins of the concept of physical entropy is provided in an appendix.
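
A quick numerical check of the limiting claim, assuming the standard definitions of the Renyi and Tsallis entropies (this is not the memo's own experiment):

```python
# Verify numerically that Renyi and Tsallis entropies approach the Shannon
# entropy (in nats) as the order alpha tends to 1.
import numpy as np

p = np.array([0.6, 0.3, 0.1])
shannon = -np.sum(p * np.log(p))

for alpha in (1.5, 1.1, 1.01, 1.001):
    renyi = np.log(np.sum(p ** alpha)) / (1.0 - alpha)
    tsallis = (np.sum(p ** alpha) - 1.0) / (1.0 - alpha)
    print(alpha, shannon, renyi, tsallis)
# Both the Renyi and Tsallis columns converge to the Shannon value as alpha -> 1.
```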

Shannon and Renyi Entropy of Wavelets

This paper reports a new reading for wavelets, which is based on the classical 'De Broglie' principle. The wave-particle duality principle is adapted to wavelets. Every continuous basic wavelet is associated with a proper probability density, allowing one to define the Shannon entropy of a wavelet. Further entropy definitions are considered, such as the Jumarie or Renyi entropy of wavelets. We proved tha...
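
One way to make this construction concrete, assuming the probability density is taken to be the normalized squared magnitude of the wavelet (one natural reading of the wave-particle analogy; the paper's exact construction may differ), is the following sketch for the Mexican-hat wavelet:

```python
# Sketch: normalize |psi(t)|^2 into a probability density and compute its
# differential Shannon entropy numerically for the Mexican-hat (Ricker) wavelet.
import numpy as np

t = np.linspace(-10, 10, 20001)
psi = (1 - t**2) * np.exp(-t**2 / 2)          # Mexican-hat wavelet
density = psi**2 / np.trapz(psi**2, t)        # normalize to a probability density
entropy = -np.trapz(density * np.log(density + 1e-300), t)
print("differential Shannon entropy (nats):", entropy)
```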

Shannon Entropy Versus Renyi Entropy from a Cryptographic Viewpoint

We provide a new inequality that links two important entropy notions: Shannon Entropy H1 and collision entropy H2. Our formula gives the worst possible amount of collision entropy in a probability distribution, when its Shannon Entropy is fixed. While in practice it is easier to evaluate Shannon entropy than other entropy notions, it is well known in folklore that it does not provide a good est...
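
The gap between the two notions is easy to see numerically. The sketch below (an illustration, not the paper's worst-case bound) builds a distribution with one heavy atom plus many light ones, for which the Shannon entropy H1 stays moderate while the collision entropy H2 drops close to its floor.

```python
# Compare Shannon entropy H1 with collision (Renyi order-2) entropy H2 in bits.
import numpy as np

def H1(p):   # Shannon entropy
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def H2(p):   # collision entropy
    return -np.log2(np.sum(p ** 2))

# One heavy atom of mass 0.5 plus 1023 light atoms sharing the rest:
# H1 is about 6 bits, while H2 stays just under 2 bits.
k = 1023
p = np.concatenate(([0.5], np.full(k, 0.5 / k)))
print("H1 =", H1(p), "bits;  H2 =", H2(p), "bits")
```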

A Comprehensive Comparison of Shannon Entropy and Smooth Renyi Entropy

We provide a new result that links two crucial entropy notions: Shannon Entropy H1 and collision entropy H2. Our formula gives the worst possible amount of collision entropy in a probability distribution, when its Shannon Entropy is fixed. Our results and techniques used in the proof immediately imply many quantitatively tight separations between Shannon and smooth Renyi entropy, which were pre...


Journal

Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence

Year: 2021

ISSN: 2159-5399, 2374-3468

DOI: https://doi.org/10.1609/aaai.v35i12.17297